$p$-Wasserstein distance
Certifying Robustness via Topological Representations
Agerberg, Jens, Guidolin, Andrea, Martinelli, Andrea, Hoefgeest, Pepijn Roos, Eklund, David, Scolamiero, Martina
In machine learning, the ability to obtain data representations that capture the underlying geometrical and topological structures of data spaces is crucial. A common approach in Topological Data Analysis for extracting multi-scale intrinsic geometric properties of data is persistent homology (PH) (Carlsson, 2009). As a rich descriptor of geometry, PH has been used in machine learning pipelines in areas such as bioinformatics, neuroscience, and material science (Dindin et al., 2020; Colombo et al., 2022; Lee et al., 2017). Perhaps the key difference between PH and other methods in Geometric Deep Learning is the emphasis on theoretical stability results: PH is a Lipschitz function, with known Lipschitz constants, with respect to appropriate metrics on the data and representation spaces (Cohen-Steiner et al., 2005; Skraba and Turner, 2020). However, composing the PH pipeline with a neural network raises challenges for the stability of the learned representations: stability may be lost, or may become insignificant in practice, when PH representations are composed with neural networks that have large Lipschitz constants. Moreover, the Lipschitz constant of the neural network may be difficult to compute or to control. While the robustness to noise of PH-based machine learning pipelines has been studied empirically (Turkeš et al., 2021), we formulate the problem in the framework of adversarial learning and propose a neural network that can learn stable and discriminative geometric representations from persistence. Our contributions may be summarized as follows: we propose the Stable Rank Network (SRN), a neural network architecture taking PH as input, where the learned representations enjoy a Lipschitz property w.r.t.
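As a loose illustration of the stability concern raised above (and not the authors' SRN architecture), the following hypothetical Python sketch computes persistence diagrams with the `ripser` library, vectorizes them with a simple lifespan histogram, and bounds the downstream network's Lipschitz constant with PyTorch's spectral normalization; the library choices, the histogram vectorization, and all parameter values are illustrative assumptions only.

```python
# Hypothetical sketch (not the authors' SRN): compute persistence diagrams with
# ripser, vectorize them into fixed-size lifespan histograms, and pass them
# through a spectrally normalized MLP so the network's Lipschitz constant
# stays bounded (each linear layer has spectral norm ~1, ReLU is 1-Lipschitz).
import numpy as np
import torch
import torch.nn as nn
from ripser import ripser


def lifespan_histogram(diagram: np.ndarray, bins: int = 32, t_max: float = 2.0) -> np.ndarray:
    """Vectorize a persistence diagram by binning lifespans (death - birth)."""
    finite = diagram[np.isfinite(diagram[:, 1])]
    lifespans = finite[:, 1] - finite[:, 0]
    hist, _ = np.histogram(lifespans, bins=bins, range=(0.0, t_max))
    return hist.astype(np.float32)


def lipschitz_mlp(in_dim: int, hidden: int = 64, out_dim: int = 16) -> nn.Module:
    """MLP whose linear layers are spectrally normalized, so the network itself
    is (approximately) 1-Lipschitz; the vectorization step is not covered."""
    sn = torch.nn.utils.parametrizations.spectral_norm
    return nn.Sequential(
        sn(nn.Linear(in_dim, hidden)), nn.ReLU(),
        sn(nn.Linear(hidden, out_dim)),
    )


# Toy usage: points on a circle -> H1 diagram -> embedding with bounded network.
X = np.random.default_rng(0).normal(size=(200, 2))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # points on the unit circle
dgm_h1 = ripser(X, maxdim=1)["dgms"][1]
features = torch.from_numpy(lifespan_histogram(dgm_h1)).unsqueeze(0)
embedding = lipschitz_mlp(in_dim=features.shape[1])(features)
print(embedding.shape)  # torch.Size([1, 16])
```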
A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions
Raghvendra, Sharath, Shirzadian, Pouyan, Zhang, Kaiyi
The $2$-Wasserstein distance is sensitive to minor geometric differences between distributions, making it a very powerful dissimilarity metric. However, due to this sensitivity, a small outlier mass can also cause a significant increase in the $2$-Wasserstein distance between two similar distributions. Similarly, sampling discrepancy can cause the empirical $2$-Wasserstein distance on $n$ samples in $\mathbb{R}^2$ to converge to the true distance at a rate of $n^{-1/4}$, which is significantly slower than the rate of $n^{-1/2}$ for the $1$-Wasserstein distance. We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW, which is based on computing the partial $2$-Wasserstein distance. We show that (1) $k$-RPW satisfies the metric properties, (2) $k$-RPW is robust to small outlier mass while retaining the sensitivity of the $2$-Wasserstein distance to minor geometric differences, and (3) when $k$ is a constant, the $k$-RPW distance between empirical distributions on $n$ samples in $\mathbb{R}^2$ converges to the true distance at a rate of $n^{-1/3}$, which is faster than the convergence rate of $n^{-1/4}$ for the $2$-Wasserstein distance. Using the partial $p$-Wasserstein distance, we extend our distance to any $p \in [1,\infty]$. By setting the parameters $k$ or $p$ appropriately, we can reduce our distance to the total variation (TV), $p$-Wasserstein, and Lévy-Prokhorov distances. Experiments show that our distance function achieves higher accuracy compared to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.
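The following is a minimal sketch of the partial $2$-Wasserstein building block behind $k$-RPW, not the paper's metric or algorithm; it assumes the POT library (`ot.emd2`, `ot.partial.partial_wasserstein2`) and a toy contaminated point cloud, and shows how discarding a small fraction of mass suppresses the effect of a single outlier.

```python
# Sketch of the partial 2-Wasserstein building block (not the paper's k-RPW
# metric): compare two empirical distributions, one contaminated by a single
# far-away outlier, with and without discarding a small fraction of mass.
# Assumes the POT library (pip install pot).
import numpy as np
import ot
import ot.partial

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
Y = rng.normal(size=(n, 2))
Y[0] = [50.0, 50.0]  # one outlier point in Y

a = np.full(n, 1.0 / n)  # uniform weights
b = np.full(n, 1.0 / n)
M = ot.dist(X, Y)        # squared Euclidean cost matrix

full_cost = ot.emd2(a, b, M)                     # full 2-Wasserstein^2 cost
partial_cost = ot.partial.partial_wasserstein2(  # transport only 99% of mass
    a, b, M, m=0.99
)

print(f"W2^2 (full)       : {full_cost:.3f}")    # blown up by the outlier
print(f"W2^2 (99% of mass): {partial_cost:.3f}")  # robust to the outlier
```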
Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions
Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions, e.g., permutations or rigid transformations. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (SFGI functions, for short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper is to combine this general architecture with a sketching idea to develop a specific and efficient neural network that can approximate the $p$-th Wasserstein distance between point sets. Importantly, the required model complexity is independent of the sizes of the input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate the Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparably to or better than other models (including a SOTA Siamese-autoencoder-based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network designs for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).
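For intuition, here is a generic permutation-invariant (DeepSets-style) sketch of a learned distance between point sets; it is not the paper's SFGI-plus-sketching architecture, and the encoder sizes and the symmetric pairing of embeddings are illustrative assumptions. Such a model would be trained to regress exact Wasserstein distances computed offline.

```python
# Generic sketch (not the paper's SFGI/sketching architecture): a DeepSets-style
# encoder makes each point-set embedding permutation invariant; a symmetric
# combination of the two embeddings feeds a regressor for the distance.
import torch
import torch.nn as nn


class SetEncoder(nn.Module):
    """Permutation-invariant embedding of a point set via sum pooling."""

    def __init__(self, point_dim: int = 2, embed_dim: int = 64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(point_dim, 128), nn.ReLU(),
                                 nn.Linear(128, embed_dim))

    def forward(self, points: torch.Tensor) -> torch.Tensor:  # (n, point_dim)
        return self.phi(points).sum(dim=0)                    # (embed_dim,)


class DistanceRegressor(nn.Module):
    """Symmetric in its two set arguments: rho(e1 + e2, |e1 - e2|)."""

    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.encoder = SetEncoder(embed_dim=embed_dim)
        self.rho = nn.Sequential(nn.Linear(2 * embed_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1), nn.Softplus())

    def forward(self, A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
        e1, e2 = self.encoder(A), self.encoder(B)
        pair = torch.cat([e1 + e2, (e1 - e2).abs()])  # symmetric features
        return self.rho(pair).squeeze(-1)


# Usage: predicted distance for two point sets of different sizes.
model = DistanceRegressor()
A, B = torch.randn(30, 2), torch.randn(45, 2)
print(model(A, B))  # scalar tensor; train it against exact W_p targets
```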
Training generative models from privatized data
Reshetova, Daria, Chen, Wei-Ning, Özgür, Ayfer
Local differential privacy (LDP) is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of the Wasserstein distance -- a popular regularization method in the literature, often leveraged for its computational benefits -- can be used to denoise the data distribution when the data is privatized by common additive noise mechanisms, such as the Laplace and Gaussian mechanisms. This combination uniquely enables the mitigation of both the regularization bias and the effects of privatization noise, thereby enhancing the overall efficacy of the model. We analyse the proposed method and provide sample complexity results, as well as experimental evidence, to support its efficacy.
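As a rough sketch of the ingredients (not the authors' training framework or their noise calibration), the snippet below privatizes data with an additive Laplace mechanism and evaluates an entropically regularized Wasserstein cost against model samples using the POT library's `ot.sinkhorn2`; the noise scale and regularization strength are arbitrary toy values.

```python
# Toy sketch: Laplace-privatized data compared to model samples via an
# entropically regularized (Sinkhorn) Wasserstein cost. Assumes the POT library.
import numpy as np
import ot

rng = np.random.default_rng(0)
n, eps = 500, 1.0                        # sample size and toy privacy budget
data = rng.normal(loc=2.0, size=(n, 2))
privatized = data + rng.laplace(scale=1.0 / eps, size=data.shape)  # LDP noise

model_samples = rng.normal(loc=2.0, size=(n, 2))  # stand-in for generator output

a = b = np.full(n, 1.0 / n)
M = ot.dist(model_samples, privatized)   # squared Euclidean cost matrix
M = M / M.max()                          # rescale costs for Sinkhorn stability
sinkhorn_cost = ot.sinkhorn2(a, b, M, reg=0.1)  # entropic-regularized OT cost
print(sinkhorn_cost)
```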
Minimum Wasserstein Distance Estimator under Finite Location-scale Mixtures
When a population exhibits heterogeneity, we often model it via a finite mixture: we decompose it into several different but homogeneous subpopulations. Contemporary practice favors learning the mixture by maximizing the likelihood, for its statistical efficiency, and using the convenient EM algorithm for numerical computation. Yet the maximum likelihood estimate (MLE) is not well defined for the most widely used finite normal mixture in particular, and for finite location-scale mixtures in general. We hence investigate feasible alternatives to the MLE, such as minimum distance estimators. Recently, the Wasserstein distance has drawn increased attention in the machine learning community. It has an intuitive geometric interpretation and has been successfully employed in many new applications. Do we gain anything by learning finite location-scale mixtures via a minimum Wasserstein distance estimator (MWDE)? This paper investigates this possibility in several respects. We find that the MWDE is consistent and derive a numerical solution under finite location-scale mixtures. We study its robustness against outliers and mild model mis-specifications. Our moderately scaled simulation study shows that the MWDE suffers some efficiency loss against a penalized version of the MLE in general, without a noticeable gain in robustness. We reaffirm the general superiority of likelihood-based learning strategies, even for non-regular finite location-scale mixtures.
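For concreteness, here is a toy MWDE sketch for a two-component univariate normal mixture, using the fact that in one dimension the $1$-Wasserstein distance equals the integral of the absolute difference between CDFs; the choice of the $1$-Wasserstein distance, the grid approximation, and the Nelder-Mead optimizer are illustrative assumptions and not the paper's numerical solution.

```python
# Toy MWDE sketch (not the paper's algorithm): fit a two-component 1D normal
# mixture by minimizing the 1-Wasserstein distance, which in one dimension
# equals the integral of |empirical CDF - mixture CDF|, approximated on a grid.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 700), rng.normal(3.0, 0.5, 300)])

grid = np.linspace(data.min() - 3, data.max() + 3, 2000)
emp_cdf = np.searchsorted(np.sort(data), grid, side="right") / data.size


def mixture_cdf(theta, x):
    w = 1.0 / (1.0 + np.exp(-theta[0]))           # mixing weight in (0, 1)
    mu1, mu2 = theta[1], theta[2]
    s1, s2 = np.exp(theta[3]), np.exp(theta[4])   # positive scales
    return w * norm.cdf(x, mu1, s1) + (1 - w) * norm.cdf(x, mu2, s2)


def w1_objective(theta):
    diff = np.abs(emp_cdf - mixture_cdf(theta, grid))
    return np.sum(diff) * (grid[1] - grid[0])     # approximate 1D W_1


theta0 = np.array([0.0, -1.0, 1.0, 0.0, 0.0])     # crude initialization
result = minimize(w1_objective, theta0, method="Nelder-Mead")
print(result.x)  # fitted (logit weight, means, log scales)
```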
Sliced Iterative Generator
We introduce the Sliced Iterative Generator (SIG), an iterative generative model that is a Normalizing Flow (NF) but shares the advantages of Generative Adversarial Networks (GANs). The model is based on iterative Optimal Transport of a series of 1D slices through the data space, matching, on each slice, the probability distribution function (PDF) of the samples to that of the data. To improve efficiency, the directions of the orthogonal slices are chosen at each iteration to maximize the PDF difference between the generated samples and the data, as measured by the Wasserstein distance. A patch-based approach is adopted to model the images hierarchically, enabling the model to scale well to high dimensions. Unlike GANs, SIG has an NF structure and allows efficient likelihood evaluations that can be used in downstream tasks. We show that SIG is capable of generating realistic, high-dimensional image samples, achieving state-of-the-art FID scores on MNIST and Fashion-MNIST without any dimensionality reduction. It also has good out-of-distribution detection properties based on the likelihood. To the best of our knowledge, SIG is the first iterative (greedy) deep learning algorithm that is competitive with state-of-the-art non-iterative generators in high dimensions. While SIG has a deep neural network architecture, the approach deviates significantly from the current deep learning paradigm, as it does not use concepts such as mini-batching, stochastic gradient descent, gradient back-propagation through deep layers, or non-convex loss function optimization. SIG is largely insensitive to hyper-parameter tuning, making it a useful generative tool for ML experts and non-experts alike.
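The core sliced-OT step can be sketched as follows; this is a bare-bones illustration with a single random slice per iteration, omitting SIG's orthogonal max-Wasserstein slices, patch hierarchy, and image-scale machinery.

```python
# Minimal sketch of one sliced-OT update (not the full SIG algorithm): project
# both sample and data sets on a random direction, match their sorted 1D
# projections (the exact 1D optimal transport map for equal-size empirical
# distributions), and move the samples along that direction accordingly.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=(3.0, -1.0), size=(1000, 2))   # target distribution
samples = rng.normal(size=(1000, 2))                 # current generated samples

for _ in range(50):  # a few greedy iterations
    direction = rng.normal(size=2)
    direction /= np.linalg.norm(direction)

    proj_samples = samples @ direction
    proj_data = data @ direction

    # 1D OT map: the i-th smallest sample projection goes to the i-th smallest
    # data projection; apply the displacement along the slice direction.
    order_s = np.argsort(proj_samples)
    target = np.sort(proj_data)
    displacement = np.empty_like(proj_samples)
    displacement[order_s] = target - proj_samples[order_s]
    samples = samples + displacement[:, None] * direction

print(samples.mean(axis=0))  # should approach the data mean, roughly (3, -1)
```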
(q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs
Mallasto, Anton, Frellsen, Jes, Boomsma, Wouter, Feragen, Aasa
Generative Adversarial Networks (GANs) have made a major impact in computer vision and machine learning as generative models. Wasserstein GANs (WGANs) brought Optimal Transport (OT) theory into GANs by minimizing the $1$-Wasserstein distance between the model and data distributions as their objective function. Since then, WGANs have gained considerable interest due to their stability and theoretical framework. We contribute to the WGAN literature by introducing the family of $(q,p)$-Wasserstein GANs, which allow the use of more general $p$-Wasserstein metrics for $p \geq 1$ in the GAN learning procedure. While the method can incorporate any cost function as the ground metric, we focus on studying the $l^q$ metrics for $q \geq 1$. This is a notable generalization, as OT distances in the WGAN literature are commonly based on the $l^2$ ground metric. We demonstrate the effect of different $p$-Wasserstein distances in two toy examples. Furthermore, we show that the ground metric does make a difference, by comparing different $(q,p)$ pairs on the MNIST and CIFAR-10 datasets. Our experiments demonstrate that changing the ground metric and $p$ can notably improve on the common $(q,p) = (2,1)$ case.
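As a sketch of the ground-metric ingredient (not the GAN training procedure), the following computes a $(q,p)$-Wasserstein distance between two empirical point clouds, assuming the POT and SciPy libraries; the ground metric is the $l^q$ distance and the transport cost is raised to the power $p$.

```python
# Sketch of the (q, p)-Wasserstein distance between two empirical point clouds:
# l^q ground metric, transport cost raised to the power p, then the p-th root.
# Assumes the POT library and SciPy.
import numpy as np
import ot
from scipy.spatial.distance import cdist


def qp_wasserstein(X, Y, q=2.0, p=1.0):
    """p-Wasserstein distance with l^q ground metric between uniform point clouds."""
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    ground = cdist(X, Y, metric="minkowski", p=q)   # pairwise l^q distances
    return ot.emd2(a, b, ground ** p) ** (1.0 / p)


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
Y = rng.normal(loc=1.0, size=(300, 2))
for q, p in [(2, 1), (1, 1), (2, 2), (1, 2)]:
    print(f"({q},{p})-Wasserstein: {qp_wasserstein(X, Y, q, p):.3f}")
```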